Natural Speech Dialogue Systems

Author

  • Volker Steinbiss
Abstract

The application of speech recognition and understanding to real-life situations generates new scientific problems. To illustrate this, we give the example of an automatic switchboard system that can be used in a very natural communication style. It is based on a high-end speech recognizer that outputs a word lattice, which is subject to further evaluation by a speech understanding component and a dialogue component. Although word error rates are above 20% for this task, the integration of additional knowledge sources such as the telephone directory and the dialogue history can add to both the acoustic and the language model in order to achieve highly improved performance. To improve the perceived dialogue quality, the level of confirmation is based on the probability that the request was understood correctly.

INTRODUCTION TO SPEECH DIALOGUE SYSTEMS

Dialogue systems that allow a person to communicate with a computer just by natural speech have reached an impressive performance. Without changing the underlying large-vocabulary speech recognition system, which produces recognition results that are not error-free, it is possible to improve speech understanding accuracy by incorporating knowledge that has to do with the specific dialogue situation. We will outline a few of these techniques in this paper.

THE PROBABILISTIC FRAMEWORK FOR SPEECH UNDERSTANDING

In speech recognition, the statistical approach has been widely accepted. It is based on the observation that speech is highly variable and influenced by factors that we cannot model or control other than with a probabilistic description. The statistical framework is the appropriate one to tackle this “lack of knowledge” and rests on a solid mathematical foundation.
In a nutshell, it can be described as follows: If a word sequence $W$ is to be recognized from an observation $O$ of acoustic vectors that have been derived from a speech signal, Bayes’ decision rule suggests choosing the $W$ that maximizes (over all $W$) the probability $P(W \mid O) = P(O \mid W)\,P(W)/P(O)$ or, equivalently, $P(O \mid W)\,P(W)$. The scientific challenge of speech recognition is the estimation of $P(O \mid W)$, the so-called acoustic model, from speech data; the estimation of $P(W)$, the language model, from text data and possibly grammar knowledge; and the efficient optimization over the possible word sequences.

The most straightforward approach to speech understanding would be to feed the output of a speech recognizer into a conventional parser for natural language. Unfortunately, the word recognition errors that occur would lead to unsatisfactory performance in the parsing path. On the other hand, to understand the meaning of a spoken sentence, it is not necessary to get each of the spoken words exactly right. It is thus desirable to let the understanding gain some robustness against recognition errors and to hand over to the parsing component some (maybe implicit) information on the lack of reliability and on alternative recognition hypotheses. This can indeed be achieved by extending the probabilistic framework that has proven so useful in speech recognition to speech understanding. Without going into the details that can be found in [1,2], the model assumes that the speaker states a set of information items $I$ by uttering a word sequence $W$ that leads to an acoustic observation $O$ of the recognition system. The methods that estimate the model parameters (training phase) and retrieve the most likely information item set (understanding) are in the spirit of the statistical methods used in speech recognition. In particular, there is no unnecessary decision on the sentence level, i.e. it is not the speech recognizer that decides on the hypothesis.
In contrast, it delivers several hypotheses that are labeled with probabilities, e.g. via a so-called word graph or just via an n-best list of sentence hypotheses. The understanding grammar, taking into account knowledge sources above the word level (e.g. the dialogue history), decides on the most likely interpretation. It is in our system
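
The ideas described in the abstract (deferring the decision from the recognizer to a later stage, adding a knowledge source such as the telephone directory, and tying the confirmation level to the probability of correct understanding) can be illustrated with a minimal Python sketch. This is not the system described in the paper: the n-best list, the scores, the directory entries, the prior values 0.9 and 0.1, the confirmation thresholds, and all function names are assumptions made for illustration only.

```python
import math

# Hypothetical n-best list: (sentence, log P(O|W), log P(W)).
# All numbers are invented for illustration.
NBEST = [
    ("connect me to millar", -9.5, -4.2),
    ("connect me to miller", -10.0, -4.0),
]

# Hypothetical telephone directory used as an extra knowledge source.
DIRECTORY = {"miller", "steinbiss"}


def directory_log_prior(sentence):
    """Favor hypotheses that mention a name listed in the directory."""
    in_directory = any(word in DIRECTORY for word in sentence.split())
    return math.log(0.9 if in_directory else 0.1)


def best_hypothesis(nbest, log_prior):
    """Return (sentence, posterior) of the highest-scoring hypothesis.

    The score is log P(O|W) + log P(W) + log_prior(W); posteriors are
    normalized over the n-best list only.
    """
    scores = [(ac + lm + log_prior(text), text) for text, ac, lm in nbest]
    top = max(score for score, _ in scores)
    # Log-sum-exp keeps the normalization numerically stable.
    log_z = top + math.log(sum(math.exp(score - top) for score, _ in scores))
    score, text = max(scores)
    return text, math.exp(score - log_z)


def confirmation_level(posterior, high=0.9, low=0.5):
    """Map a posterior to a confirmation strategy (thresholds are assumptions)."""
    if posterior >= high:
        return "none"
    if posterior >= low:
        return "implicit"
    return "explicit"
```

With these invented scores, the acoustic and language model alone prefer "connect me to millar", while adding the directory prior shifts the decision to "connect me to miller". The resulting posterior can then be passed to `confirmation_level` to choose how strongly the system should ask the caller to confirm.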

Similar resources

Modeling Lateral Communication in Holonic Multi Agent Systems

Agents, in a multi agent system, communicate with each other through the process of exchanging messages which is called dialogue. Multi agent organization is generally used to optimize agents’ communications. Holonic organization demonstrates a self-similar recursive and hierarchical structure in which each holon may include some other holons. In a holonic system, lateral communication occurs b...

Spoken dialogue systems

During the past decade there have been substantial improvements in natural language processing, speech recognition and text-to-speech conversion. Spoken dialogue is the missing component between these, making it possible to develop conversational interfaces to computers. In this paper I introduce the basics of spoken dialogue systems and present some applications using these techniques.

First-Order Logical Formalization of Dialogue Models for Natural Language Interface Systems

For the design of natural language interface systems, a number of models and theories have been developed as well as language processing systems. We propose a model of dialogue systems by a first-order logical formalization based on meta logics. This model captures various concepts of dialogue systems: the concepts of software systems talking about such as time, actions, conditional and causality...

Speech Production in Human-Machine Dialogue: A Natural Language Generation Perspective

This article discusses speech production in dialogue from the perspective of natural language generation, focusing on the selection of appropriate intonation. We argue that in order to assign appropriate intonation contours in speech-producing systems, it is vital to acknowledge the diversity of functions that intonation fulfills and to account for communicative and immediate contexts as major fa...

A Study of Human Dialogue Strategies in the Presence of Speech Recognition Errors

Continuous speech recognition technology has recently matured to the point where it has become feasible to develop spoken dialogue interfaces to computer systems that help people perform tasks in simple domains. Word recognition accuracy for spontaneous dialogue, however, is still a long way from perfect, and any system that uses spoken natural language input has to have some mechanism for dea...

Studies on Robust Language and Dialogue Processing for Spoken Dialogue Systems

In spoken dialogue systems, robust language processing for spontaneous speech understanding and robust dialogue processing for achieving user goal are inevitable. Previously, research of speech recognition and research of natural language understanding were done independently. At first glance, it seems to be no problem to combine these two technologies, because the purpose of speech recognition...

Publication date: 1998